Improve ConfigNode leader warm-up before serving by CRZbulabula · Pull Request #17821 · apache/iotdb

CRZbulabula · 2026-06-02T12:07:47Z

Summary

Gate ConfigNode leader confirmation on LoadCache warm-up after consensus leader-ready.
Track first heartbeat coverage for Nodes, Regions, RegionGroups, and ConsensusGroups before serving requests.
Return CONFIG_NODE_LEADER_WARMING_UP during warm-up so DataNodes wait and retry the current ConfigNode instead of treating it as redirection.
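
As a rough illustration of the third point, a caller can map status codes to actions so that warm-up means "wait and retry the same ConfigNode" rather than "switch nodes". This is a minimal sketch, not the actual ConfigNodeClient code: the code 1014 is the value this PR gives CONFIG_NODE_LEADER_WARMING_UP, while the class name, the `Action` enum, and the `SUCCESS` and `REDIRECTION` values are placeholders assumed for the example.

```java
/** Hypothetical sketch: classify a status code into the action a client should take. */
final class WarmUpStatusSketch {
  // Value assigned to CONFIG_NODE_LEADER_WARMING_UP by this PR.
  static final int CONFIG_NODE_LEADER_WARMING_UP = 1014;
  // Placeholder values, assumed only for this sketch.
  static final int SUCCESS = 200;
  static final int REDIRECTION = 400;

  enum Action {
    DONE,
    RETRY_SAME_NODE,
    SWITCH_TO_SUGGESTED_LEADER,
    FAIL
  }

  static Action decide(int statusCode) {
    switch (statusCode) {
      case SUCCESS:
        return Action.DONE;
      case CONFIG_NODE_LEADER_WARMING_UP:
        // The node is the leader but its load cache is still warming up:
        // keep talking to it instead of treating the reply as a redirection.
        return Action.RETRY_SAME_NODE;
      case REDIRECTION:
        return Action.SWITCH_TO_SUGGESTED_LEADER;
      default:
        return Action.FAIL;
    }
  }
}
```

Keeping the warm-up case separate from redirection is the point of the new status: the caller sleeps and asks the same node again instead of cycling through the ConfigNode list.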

Tests

mvn spotless:apply -pl iotdb-core/confignode,iotdb-core/datanode,iotdb-client/service-rpc
mvn compile -pl iotdb-client/service-rpc,iotdb-core/confignode
mvn test -pl iotdb-core/confignode -Dtest=LoadManagerTest
mvn compile -pl iotdb-client/service-rpc,iotdb-core/datanode (fails in unrelated existing sources: ArrayDeviceTimeIndex.java and TableDeviceSchemaCache.java still pass IDeviceID to PartialPath.matchFullPath)

codecov · 2026-06-02T12:51:22Z

Codecov Report

❌ Patch coverage is 21.62162% with 232 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.69%. Comparing base (c3e74a2) to head (df8ce33).
⚠️ Report is 1 commit behind head on master.

Files with missing lines	Patch %	Lines
...nsensus/statemachine/ConfigRegionStateMachine.java	7.31%	114 Missing ⚠️
...c/handlers/heartbeat/DataNodeHeartbeatHandler.java	0.00%	41 Missing ⚠️
...che/iotdb/db/protocol/client/ConfigNodeClient.java	0.00%	27 Missing ⚠️
...confignode/manager/consensus/ConsensusManager.java	3.70%	26 Missing ⚠️
...che/iotdb/confignode/manager/load/LoadManager.java	74.35%	10 Missing ⚠️
...che/iotdb/confignode/manager/ProcedureManager.java	0.00%	6 Missing ⚠️
...rg/apache/iotdb/confignode/service/ConfigNode.java	0.00%	3 Missing ⚠️
...apache/iotdb/confignode/manager/ConfigManager.java	0.00%	2 Missing ⚠️
...iotdb/confignode/manager/load/cache/LoadCache.java	88.23%	2 Missing ⚠️
.../iotdb/confignode/procedure/ProcedureExecutor.java	83.33%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #17821      +/-   ##
============================================
+ Coverage     40.54%   40.69%   +0.14%     
+ Complexity     2622     2621       -1     
============================================
  Files          5244     5244              
  Lines        362367   362567     +200     
  Branches      46651    46678      +27     
============================================
+ Hits         146938   147552     +614     
+ Misses       215429   215015     -414

Caideyipi

I reviewed the warm-up changes on a57680d2542. I think there are a few issues that should be fixed before merge:

AINode treats the new warm-up status as a hard failure. ConfigManager.registerAINode() now returns CONFIG_NODE_LEADER_WARMING_UP while confirmLeader() is warming up, but the Python AINode client only treats REDIRECTION_RECOMMEND as retryable in _update_config_node_leader(). node_register() / node_restart() then call verify_success() and raise on status 1014, so an AINode can fail startup if it hits the leader during warm-up. Please add the new code to the AINode constants and retry handling paths.
Non-seed ConfigNode registration has the same gap. registerConfigNode() can now return CONFIG_NODE_LEADER_WARMING_UP, but ConfigNode.sendRegisterConfigNodeRequest() only retries success/redirection/internal-retry statuses and throws StartupException for anything else. A ConfigNode joining during leader warm-up can fail immediately instead of waiting and retrying.
The async leader-service startup has a stepdown race. notifyLeaderReady() now submits startLeaderServicesAfterLoadReady() asynchronously. That task checks isLeaderReady() only once before starting leader-only services and setting leaderServicesReady=true. If notifyNotLeader() runs after that check but before/during service startup, the old task can re-enable services after cleanup. Please guard this with a leader epoch/cancellation token, and re-check before setting leaderServicesReady.
The DataNode register retry budget is too tight for the 30s warm-up tolerance. On CONFIG_NODE_LEADER_WARMING_UP, updateConfigNodeLeader() sleeps 2s and returns retryable, while registerDataNode() has 15 attempts. The final request can still happen before the 30s tolerance expires, then sleep and exit without one post-tolerance attempt. A deadline-based retry or a larger retry budget would avoid this edge case.
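
To make the last point concrete: assuming negligible request latency, 15 attempts spaced 2s apart are issued at 0s, 2s, ..., 28s, so the final attempt lands before the 30s tolerance expires. The sketch below contrasts a fixed attempt budget with a deadline-based loop. It is only an illustration under those assumptions, not the DataNode code: the class and method names, the injected clock and sleeper, and the `SUCCESS` value are hypothetical; only the code 1014 comes from this PR.

```java
import java.util.function.IntSupplier;
import java.util.function.LongConsumer;
import java.util.function.LongSupplier;

/** Hypothetical sketch: fixed-attempt retry versus deadline-based retry during warm-up. */
final class DeadlineRetrySketch {
  static final int WARMING_UP = 1014; // CONFIG_NODE_LEADER_WARMING_UP in this PR
  static final int SUCCESS = 200; // placeholder value for this sketch

  /** Gives up after maxAttempts requests, no matter how little time has passed. */
  static boolean retryByAttempts(
      IntSupplier request, int maxAttempts, long sleepMs, LongConsumer sleeper) {
    for (int i = 0; i < maxAttempts; i++) {
      int status = request.getAsInt();
      if (status != WARMING_UP) {
        return status == SUCCESS;
      }
      sleeper.accept(sleepMs);
    }
    return false;
  }

  /** Keeps retrying while the leader is warming up, until the time budget is used up. */
  static boolean retryUntilDeadline(
      IntSupplier request,
      long budgetMs,
      long sleepMs,
      LongSupplier clock,
      LongConsumer sleeper) {
    long deadline = clock.getAsLong() + budgetMs;
    while (true) {
      int status = request.getAsInt();
      if (status != WARMING_UP) {
        return status == SUCCESS;
      }
      if (clock.getAsLong() >= deadline) {
        return false; // budget exhausted while still warming up
      }
      sleeper.accept(sleepMs);
    }
  }
}
```

With a leader that only becomes ready at the 30s mark, the fixed budget of 15 attempts gives up without ever seeing it, while a deadline longer than the tolerance guarantees at least one request after the tolerance expires.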

Copilot

Pull request overview

This PR improves ConfigNode leader “warm-up” semantics so DataNodes avoid premature redirection during leader transitions, and ConfigNode serving is gated on initial heartbeat sampling readiness.

Changes:

Add a dedicated CONFIG_NODE_LEADER_WARMING_UP status and have DataNodes wait/retry the current leader during warm-up.
Introduce LoadManager.isLoadReady() and a 30s tolerance window to require first heartbeat coverage (ConfigNode/DataNode) before considering load services ready.
Track consensus-group heartbeat sampling coverage and add tests for warm-up readiness behavior.
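
The readiness rule described above can be sketched roughly as follows. This is a simplified model rather than the LoadManager implementation: it assumes the tolerance window means "report ready once every expected node has a first heartbeat sample, or once 30s have passed since warm-up started". The class, its fields, and the integer node ids are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch: load readiness gated on first heartbeat coverage with a tolerance. */
final class LoadReadinessSketch {
  static final long TOLERANCE_MS = 30_000L; // the 30s window mentioned in the review

  private final Set<Integer> expectedNodes;
  private final Set<Integer> sampledNodes = new HashSet<>();
  private final long warmUpStartMs;

  LoadReadinessSketch(Set<Integer> expectedNodes, long warmUpStartMs) {
    this.expectedNodes = new HashSet<>(expectedNodes);
    this.warmUpStartMs = warmUpStartMs;
  }

  /** Records the first heartbeat sample of a node; unknown nodes are ignored. */
  void onHeartbeat(int nodeId) {
    if (expectedNodes.contains(nodeId)) {
      sampledNodes.add(nodeId);
    }
  }

  /** Ready when every expected node has reported, or when the tolerance has elapsed. */
  boolean isLoadReady(long nowMs) {
    boolean fullCoverage = sampledNodes.containsAll(expectedNodes);
    boolean toleranceElapsed = nowMs - warmUpStartMs >= TOLERANCE_MS;
    return fullCoverage || toleranceElapsed;
  }
}
```

The tolerance branch is what keeps a dead or unreachable node from blocking the new leader indefinitely.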

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.

File	Description
iotdb-core/datanode/src/main/java/org/apache/iotdb/db/protocol/client/ConfigNodeClient.java	Treat `CONFIG_NODE_LEADER_WARMING_UP` as a wait-and-retry instead of redirection.
iotdb-core/confignode/src/test/java/org/apache/iotdb/confignode/manager/load/LoadManagerTest.java	Add tests validating load warm-up readiness criteria and 30s tolerance behavior.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/LoadManager.java	Add load readiness state machine (`isLoadReady`, reason strings, tolerance window).
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/cache/LoadCache.java	Track consensus-group sampled nodes; add node-heartbeat unready reasons; cache “unreported” samples.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/cache/consensus/ConsensusGroupCache.java	Always update consensus stats from last sample (including “unready leader”).
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/cache/AbstractLoadCache.java	Add `hasHeartbeatSample()` helper.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/consensus/ConsensusManager.java	Gate leader confirmation on consensus-ready + leader-services-ready + load-ready; return warming-up status.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/consensus/statemachine/ConfigRegionStateMachine.java	Track `leaderServicesReady` and start load services before leader services.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/client/async/handlers/heartbeat/DataNodeHeartbeatHandler.java	Improve null-safety and cache consensus/region samples; add missing-region sampling on partial reports.
iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/client/async/handlers/heartbeat/ConfigNodeHeartbeatHandler.java	On error, force-update node cache to Unknown (no connection-broken check).
iotdb-client/service-rpc/src/main/java/org/apache/iotdb/rpc/TSStatusCode.java	Add `CONFIG_NODE_LEADER_WARMING_UP(1014)`.

CRZbulabula · 2026-06-09T04:43:52Z

@Caideyipi Thanks for the detailed review. Fixed in the latest commits, especially f54b484.

AINode registration now treats CONFIG_NODE_LEADER_WARMING_UP as retryable instead of a hard failure.
Non-seed ConfigNode registration now waits and retries when the leader returns CONFIG_NODE_LEADER_WARMING_UP.
ConfigRegionStateMachine now uses a leader-services epoch guard, serializes startup and cleanup with leaderServicesLock, and re-checks the epoch before marking leader services ready.
DataNode registration now uses a 60s warm-up retry deadline, so it can still issue requests after the 30s first-heartbeat tolerance has expired.

I also cleaned up the follow-up warm-up sampling concerns: removed the unreported DataNode Region heartbeat chain, removed the extra DataNodeHeartbeatHandler region-group argument, and kept consensus sampling to only cache leader samples when the DataNode reports leader=true with a consensus logical timestamp.

Caideyipi

I still see one remaining stepdown race that I think should be fixed before merge.

ConfigRegionStateMachine.submitIfLeaderServicesEpochCurrent() checks the epoch only before invoking task.run(). If an async task passes that check, then notifyNotLeader() runs and finishes cleanup, the old task can resume and start leader-only services such as startCQScheduler(), startPipeMetaSync(), startPipeHeartbeat(), or startSubscriptionMetaSync() after the node is no longer leader.

The main startup path is guarded by leaderServicesLock, but these submitted tasks are not serialized with cleanup. Please either run the task under leaderServicesLock and re-check isCurrentLeaderServicesEpoch(epoch) inside the lock, or pass a cancellation/epoch guard into the individual service start paths.
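
The first option could look roughly like the sketch below, which runs the task while holding the lock and re-checks the epoch inside it. The class and method names are hypothetical and only mirror the identifiers mentioned above. It also assumes the task is short enough to run under the lock.

```java
/** Hypothetical sketch: run a leader-only task under the lock with an epoch re-check. */
final class EpochGuardSketch {
  private final Object leaderServicesLock = new Object();
  private long epoch = 0; // guarded by leaderServicesLock
  private boolean leaderServicesReady = false; // guarded by leaderServicesLock

  /** Starts a new leadership epoch and returns it so tasks can be tagged with it. */
  long becomeLeader() {
    synchronized (leaderServicesLock) {
      return ++epoch;
    }
  }

  /** Invalidates the current epoch; cleanup runs under the same lock as the tasks. */
  void notifyNotLeader() {
    synchronized (leaderServicesLock) {
      epoch++;
      leaderServicesReady = false;
      // Leader-only services would be stopped here.
    }
  }

  /** Runs the task only if its epoch is still current; check and run are atomic. */
  boolean runIfEpochCurrent(long taskEpoch, Runnable task) {
    synchronized (leaderServicesLock) {
      if (epoch != taskEpoch) {
        return false; // stale task from an earlier leadership term
      }
      task.run();
      return true;
    }
  }
}
```

Because the check and `task.run()` share one critical section with the cleanup, a task either finishes before cleanup starts or observes the bumped epoch and is skipped.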

CRZbulabula · 2026-06-09T10:22:58Z

@Caideyipi Good catch — thanks. Fixed in e074337 by reworking the leader-services lifecycle so this race can no longer happen.

The root cause was that submitIfLeaderServicesEpochCurrent() only checked the epoch before task.run(), and those submitted tasks were not serialized against notifyNotLeader()'s cleanup. I removed that helper entirely. The new design:

All transitions are serialized on a single-thread executor. notifyLeaderReady (become-leader), notifyNotLeader / notifyLeaderChanged (resign) all submit to one single-thread leaderServicesTransitionExecutor. Because it has exactly one worker, a become-leader orchestration and a resign cleanup can never run concurrently — one runs to completion before the other starts. So startCQScheduler() / startPipeMetaSync() / startPipeHeartbeat() / startSubscriptionMetaSync() can no longer interleave with cleanup.
The epoch is bumped eagerly on resign, before cleanup is even queued. notifyNotLeader calls invalidateLeaderServices() synchronously on the consensus thread, so the epoch advances the instant we lose leadership. An in-flight becomeLeader re-checks isCurrentLeaderServicesEpoch(epoch) after the parallel startups join and again before it sets leaderServicesReady = true, so a stale epoch bails out and never re-enables services after cleanup.
leaderServicesReady is only set inside leaderServicesLock with the epoch re-checked, so the "set ready" step is atomic with respect to the epoch.

Within a single become-leader epoch, load services still start first (for warm-up), then the remaining independent services start in parallel on a cached pool and are joined before the epoch is marked ready. So the check-then-run gap you pointed out is closed both by the single-thread serialization and by the epoch re-check inside the lock.
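
A condensed sketch of that design, under simplifying assumptions: it models only the single-thread transition executor, the eager epoch bump, and the locked "set ready" step, and leaves out the parallel service startup on the cached pool. All names are hypothetical stand-ins, not the ConfigRegionStateMachine code.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: serialized leader transitions with an eagerly bumped epoch. */
final class LeaderTransitionSketch {
  private final ExecutorService transitionExecutor = Executors.newSingleThreadExecutor();
  private final AtomicLong epoch = new AtomicLong();
  private final Object leaderServicesLock = new Object();
  private boolean leaderServicesReady; // guarded by leaderServicesLock

  void notifyLeaderReady(Runnable startServices) {
    long myEpoch = epoch.get();
    transitionExecutor.submit(() -> {
      if (epoch.get() != myEpoch) {
        return; // leadership was lost before this transition even started
      }
      startServices.run();
      synchronized (leaderServicesLock) {
        // Re-check: a resign during startup makes this transition stale.
        if (epoch.get() == myEpoch) {
          leaderServicesReady = true;
        }
      }
    });
  }

  void notifyNotLeader(Runnable stopServices) {
    epoch.incrementAndGet(); // bumped on the caller's thread, before cleanup is queued
    transitionExecutor.submit(() -> {
      synchronized (leaderServicesLock) {
        leaderServicesReady = false;
      }
      stopServices.run();
    });
  }

  boolean isLeaderServicesReady() {
    synchronized (leaderServicesLock) {
      return leaderServicesReady;
    }
  }

  /** Blocks until every transition submitted before this call has finished. */
  void awaitQuiescent() {
    try {
      transitionExecutor.submit(() -> {}).get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e);
    }
  }

  void shutdown() {
    transitionExecutor.shutdown();
  }
}
```

In this model a resign that arrives while `startServices` is still running bumps the epoch immediately, so the in-flight transition skips the "set ready" step, and the queued cleanup then runs strictly after it on the same worker thread.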

Refactor ConfigRegionStateMachine so leader become/resign transitions are strictly serial. All transitions (notifyLeaderReady / notifyNotLeader / notifyLeaderChanged) are submitted to a single-thread transition executor, which is the barrier that keeps epochs serial: one transition's orchestration runs to completion before the next begins. Within a become-leader epoch, load services start first to warm up as early as possible, then the remaining independent leader services start in parallel on a cached pool and are joined before the epoch is marked ready. The epoch is bumped eagerly on resign so an in-flight startup detects it is stale and bails out before re-enabling services after cleanup. This removes the giant lock-wrapped startLeaderServices method and the per-task submitIfLeaderServicesEpochCurrent helper; leaderServicesLock now only guards the (epoch, ready) pair.

This reverts commit 5c69640.

sonarqubecloud · 2026-06-09T11:45:54Z

Quality Gate passed

Issues
4 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
1.9% Duplication on New Code

See analysis details on SonarQube Cloud

Caideyipi reviewed Jun 5, 2026

CRZbulabula requested a review from Copilot June 8, 2026 01:47

Copilot started reviewing on behalf of CRZbulabula June 8, 2026 01:47

Copilot AI reviewed Jun 8, 2026

Comment thread ...va/org/apache/iotdb/confignode/client/async/handlers/heartbeat/DataNodeHeartbeatHandler.java Outdated

Comment thread ...-core/confignode/src/main/java/org/apache/iotdb/confignode/manager/load/cache/LoadCache.java Outdated

Caideyipi reviewed Jun 9, 2026

CRZbulabula added 9 commits June 9, 2026 19:31

Improve ConfigNode leader warm-up gating

59028b2

Refine ConfigNode leader warm-up readiness

17e9c3b

Fix ConfigNode leader warm-up recovery

f724ea4

Simplify consensus leader warm-up sampling

a0fcc1d

Refine ConfigNode leader warm-up flow

bac0c60

Clean up leader warm-up heartbeat flow

8f9f175

Fix Sonar hotspot in simple consensus log parsing

599a9be

Update ConfigRegionStateMachine.java

5c69640

CRZbulabula force-pushed the improve-confignode-leader-confirm branch from 7408a91 to 5c69640 Compare June 9, 2026 11:31

CRZbulabula added 2 commits June 9, 2026 19:37

Update ConsensusManager.java

cb63667

Revert "Update ConfigRegionStateMachine.java"

df8ce33

This reverts commit 5c69640.

CRZbulabula merged commit ddd8faa into master Jun 10, 2026
45 checks passed

CRZbulabula deleted the improve-confignode-leader-confirm branch June 10, 2026 06:31

CRZbulabula mentioned this pull request Jun 10, 2026

Catch per-startup failures during ConfigNode leader warm-up #17898

Merged

CRZbulabula merged 11 commits into master from improve-confignode-leader-confirm
